The notion of forgetfulness, used in discrete quantum memory channels, is slightly weakened in order to 
be applied to the case of continuous channels. This is done in the context of quantum memory channels with 
Markovian noise. As a case study, we apply the notion of weak-forgetfulness to a bosonic memory channel with 
additive noise. A suitable encoding and decoding unitary transformation allows us to unravel the effects of the 
memory, hence the channel capacities can be computed using known results from the memoryless setting. 

I. INTRODUCTION 

One of the main issues in quantum information theory is
the evaluation of the maximum rate, i.e. the capacity, at which
(classical or quantum) information can be reliably transmitted
via a quantum communication channel. When studying models
of noisy quantum communication, a common assumption
is that the noise affecting the channel is identical and independent
at each channel use. In mathematical terms, the completely
positive trace-preserving (CPT) map $\mathcal{E}^{(n)}$ describing $n$
uses of the quantum channel is the tensor product of $n$ identical
copies:

$$\mathcal{E}^{(n)} = \mathcal{E}^{\otimes n} , \tag{1}$$

where $\mathcal{E}$ is the CPT map describing a single use of the quantum
channel. A channel of this kind is called, like its classical
counterpart, a memoryless quantum channel. Coding theorems
for memoryless quantum channels, which express
the channel capacities in terms of entropic quantities, are
well-established results in quantum information theory 0]] . However,
the assumption of independent and identical noise can be
rather artificial in several physical settings where memory effects
may naturally appear, see e.g. [2] and references therein.
This observation leads one to consider quantum channels with a
more general structure than the simple tensor-product structure
of (1). Every quantum channel such that

$$\mathcal{E}^{(n)} \neq \bigotimes_{j=1}^{n} \mathcal{E} \tag{2}$$

is called a quantum channel with memory, or simply a memory 
channel. For memory channels the noises affecting multiple 
channel uses are in general neither independent nor identical. 

The structure theorem for memory channels was provided
in [2]. Under the assumptions of causality and invariance under
time translation, a sequence of $n$ uses of a memory channel
can always be decomposed as the $n$-fold concatenation
$S^{(n)}$ of an elementary transformation $S$. Such a decomposition
requires the introduction of an ancillary system $M$, called the
memory kernel (or simply the memory), which accounts for
correlations. The elementary transformation has two input
and two output systems. In Fig. 1 the horizontal lines indicate
the sender ($A$) and the receiver ($B$) systems, and the vertical lines
the input and output memory. Multiple uses of the memory
channel are hence obtained by concatenating the elementary
transformation through the vertical line, as shown on the
right-hand side of Fig. 1.

FIG. 1: On the left: each use of the memory channel is represented
by an elementary transformation $S$ with two input systems $A$ and
$M$ and two outputs $B$ and $M$. On the right: $n$ uses of the memory
channel are represented as the $n$-fold concatenation of the elementary
transformation.

The performance of a memory channel is in general determined
by the memory initialization. Different initial states
of the memory kernel can lead to different values of the channel
capacities. This is not the case for "forgetful" channels,
whose capacities are independent of the memory initialization.
Moreover, coding theorems for forgetful channels are
straightforward extensions of their memoryless counterparts.
The behavior of forgetful channels is asymptotically independent
of the memory initialization: after a sufficiently large
number of channel uses, the memory system "forgets"
its initial state. This property was put forward in
Oh. Then, the notion of "forgetfulness" in discrete quantum
channels, i.e. CPT maps acting on finite-dimensional Hilbert
spaces, was formalized in [2].

Recently, in the framework of continuous memory channels,
i.e. CPT maps acting on infinite-dimensional Hilbert spaces, it
has been noticed that the extension of the notion of forgetfulness
to this framework is highly nontrivial [3].

Below, this notion will be slightly weakened and extended
to the case of continuous quantum channels by considering
Markovian noise. As an application we shall evaluate the
classical capacity of a bosonic memory channel with additive
noise.

The paper develops along the following lines. In Sec. II
the notion of forgetfulness will be considered in the context
of quantum channels with Markovian correlated noise and
adapted to the continuous variable setting. We introduce a
notion of "weak-forgetfulness" to be applied in the case of a
Markov process with continuous noise variable. In Sec. III
a model of quantum channel subjected to additive Gaussian
noise with Markovian correlations will be proposed. Suitable
encoding and decoding unitary transformations allow us
to unravel the effects of the memory, hence the channel capacities
can be computed using known results from the memoryless
setting.



II. QUANTUM CHANNELS WITH MARKOVIAN 
CORRELATED NOISE 

In this section we consider the notion of forgetfulness applied
to a class of quantum memory channels with
Markovian correlated noise.

Let us first recall the definition of forgetfulness as presented
in [2].

Definition 1 (Forgetfulness) A memory channel is forgetful
iff for any $\epsilon > 0$, there exists an integer $\nu$ such that for any
$n > \nu$

$$\left\| \mathrm{Tr}_B\left[ S^{(n)}(\rho_{1,A,M}) \right] - \mathrm{Tr}_B\left[ S^{(n)}(\rho_{2,A,M}) \right] \right\|_1 < \epsilon , \tag{3}$$

for any $\rho_{1,A,M}$, $\rho_{2,A,M}$ states of the $n$ inputs and the initial
memory such that

$$\mathrm{Tr}_M(\rho_{1,A,M}) = \mathrm{Tr}_M(\rho_{2,A,M}) . \tag{4}$$



This definition of forgetfulness applies in the Schroedinger
picture description of the memory channel; an equivalent definition
can be formulated in the Heisenberg picture. Let us
briefly comment on it. The density operators $\rho_{1,A,M}$, $\rho_{2,A,M}$
describe two input states of the $n$-fold concatenation $S^{(n)}$,
including the initial state of the memory kernel $M$ and the state
of the $n$ channel inputs belonging to the sender $A$. Equation (4)
states that $\rho_{1,A,M}$ and $\rho_{2,A,M}$ differ only in the
reduced state of the memory kernel, corresponding to two different
memory initializations. In Eq. (3), the partial traces
$\mathrm{Tr}_B[S^{(n)}(\rho_{1,A,M})]$, $\mathrm{Tr}_B[S^{(n)}(\rho_{2,A,M})]$ over the $n$ outputs
of the channel belonging to the receiver $B$ indicate the final
states of the memory kernel after $n$ channel uses. Hence, after
$n > \nu$ uses of a forgetful channel the final state of the memory
kernel can be assumed to be independent of the memory initialization
with an error smaller than $\epsilon$, where $\nu$ is determined only
by the error threshold $\epsilon$, uniformly for all initial states
of the memory kernel. The trace distance

$$\| \rho_1 - \rho_2 \|_1 := \mathrm{Tr}\left( | \rho_1 - \rho_2 | \right) \tag{5}$$

is used to quantify the distance between the final states of the
memory kernel.

If the memory channel is forgetful, one can adopt a double-block
encoding. Over $m = n + l$ channel uses, the first $n > \nu$
are not used to send information, but only to let the memory
kernel forget its initial state with an error smaller than $\epsilon$; then
the remaining $l$ are used to send information through the channel.
This double-blocking procedure allows one to prove the coding
theorem for forgetful channels.

Quantum channels with Markovian correlated noise were
first considered in 0. Here we consider such
channels characterized by a memory system represented by a
classical random variable $Z$ taking values $z$ in a measurable
set $\Omega$.

At the $k$th use of the channel an input state $\rho_A$ maps to an
output state

$$\rho_B = \int dz \, P_k(z) \, \mathcal{E}_z(\rho_A) , \tag{6}$$

where $P_k(z)$ is the probability distribution of the random variable
$Z$ at step $k$ and $\mathcal{E}_z$ is a CPT map for any $z \in \Omega$. The probability
distribution of the noise variable changes according to the
Markov rule

$$P_{k+1}(z) = \int dz' \, \omega(z|z') \, P_k(z') , \tag{7}$$

in which $\omega(z|z')$ is the transition function determining the
stationary Markov process. The model studied in [6] belongs to
this class of memory channels. Coding theorems for this class
of memory channels were provided in |0] in the case where $z$ is a
discrete variable.

Recalling that in a forgetful channel the final state of the
memory is independent of the initial memory, one would say
that a quantum memory channel with Markovian correlated
noise is forgetful if and only if the Markov process of the
environment has a unique stationary state. This is indeed the
case for quantum channels acting on discrete variable quantum
systems. From this intuition we are led to introduce the
notion of "weak-forgetfulness" to include the case
of infinite-dimensional quantum memory channels. Accordingly,
we define weak-forgetful channels as follows:

Definition 2 (Weak-Forgetfulness) A memory channel with
Markovian correlated noise is "weak-forgetful" iff for any $\epsilon > 0$,
and for any pair of initial probability distributions $P_1(z)$,
$P'_1(z)$, there exists an integer $\nu$ such that for any $n > \nu$

$$\| P_n - P'_n \|_1 < \epsilon , \tag{8}$$

where

$$\| P_n - P'_n \|_1 = \int dz \, | P_n(z) - P'_n(z) | \tag{9}$$

is the distance between the probability distributions at step $n$,

$$P_n(z) = \int \prod_{k=1}^{n-1} dz_k \, \omega(z|z_{n-1}) \cdots \omega(z_2|z_1) P_1(z_1) , \tag{10}$$

$$P'_n(z) = \int \prod_{k=1}^{n-1} dz_k \, \omega(z|z_{n-1}) \cdots \omega(z_2|z_1) P'_1(z_1) , \tag{11}$$

obtained from two different initial probability distributions $P_1(z)$ and
$P'_1(z)$.

Hence we can say that, even in the case of continuous variables,
a memory channel with Markovian correlated noise is
weak-forgetful iff the underlying Markov process has a unique
stationary state.
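
As a minimal numerical sketch of Definition 2, the Markov rule (7) can be iterated for a discrete noise variable and the distance (9) monitored. The three-state transition matrix `W` below is a hypothetical example, not taken from the paper; it is chosen with strictly positive entries so that the stationary distribution is unique. The function names are likewise illustrative.

```python
import numpy as np

# Hypothetical 3-state transition matrix: W[z, zp] = w(z | zp).
# Each column sums to 1, so W maps probability vectors to probability vectors.
W = np.array([[0.8, 0.1, 0.2],
              [0.1, 0.7, 0.3],
              [0.1, 0.2, 0.5]])

def evolve(p1, steps):
    """Apply the Markov rule P_{k+1} = W P_k `steps` times."""
    p = np.asarray(p1, dtype=float)
    for _ in range(steps):
        p = W @ p
    return p

def l1_distance(p, q):
    """Discrete analogue of Eq. (9): sum over z of |P(z) - P'(z)|."""
    return float(np.abs(np.asarray(p) - np.asarray(q)).sum())

# Two different memory initializations (point masses on different states).
p1 = [1.0, 0.0, 0.0]
q1 = [0.0, 0.0, 1.0]
distances = [l1_distance(evolve(p1, k), evolve(q1, k)) for k in range(0, 51, 10)]
print(distances)
```

The printed distances shrink as the number of steps grows, which is the behavior Definition 2 asks for when the stationary distribution is unique.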

To adopt a double-block procedure one should wait for
$n > \nu$ channel uses in order to let the noise process approach
the stationary state and then start encoding information.
It is worth mentioning that "weak-forgetfulness" differs
from the "forgetfulness" property in the sense that the noise
probability distributions converge non-uniformly with respect to the
initial distributions $P_1$, $P'_1$: the integer $\nu$ may depend on them.
Some examples will be discussed
in the next section. In conclusion, the notions of forgetfulness
and weak-forgetfulness clearly coincide if the set $\Omega$ in which
the noise variable takes values is compact. This is the case of
the discrete random variable studied in 0].
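
The non-uniform convergence can be illustrated with a short sketch. It assumes, purely as an illustration, a complex Gaussian autoregressive noise process with memory parameter `mu` (0 <= mu < 1) and stationary variance `sigma` > 0, started from two point-mass initial distributions separated by `delta` in the complex plane. After `t` transitions both distributions are isotropic Gaussians with variance `sigma * (1 - mu**(2*t))` and means separated by `mu**t * delta`, which gives the closed-form L1 distance used below. The function names are hypothetical.

```python
import math

def l1_distance(delta, mu, sigma, t):
    """L1 distance after t >= 1 transitions between two noise distributions
    that started as point masses separated by `delta`.
    Both are Gaussians ~ exp(-|z - m|^2 / s) with s = sigma * (1 - mu^(2t))
    and means separated by d = mu^t * delta, giving 2 * erf(d / (2 * sqrt(s)))."""
    s = sigma * (1.0 - mu ** (2 * t))
    d = (mu ** t) * delta
    return 2.0 * math.erf(d / (2.0 * math.sqrt(s)))

def steps_needed(delta, mu, sigma, eps, t_max=10_000):
    """Smallest t with L1 distance below eps (None if not reached by t_max)."""
    for t in range(1, t_max + 1):
        if l1_distance(delta, mu, sigma, t) < eps:
            return t
    return None

for delta in (1.0, 1e3, 1e6):
    print(delta, steps_needed(delta, mu=0.5, sigma=1.0, eps=1e-2))
```

Under these assumptions the number of steps needed to reach a given threshold grows with the initial separation, so no single value of the integer works for all pairs of initial distributions.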



III. ADDITIVE GAUSSIAN NOISE 

In this section we consider a model of bosonic memory
channel with Markovian correlated noise. The notion of
weak-forgetfulness is applied to this model.

The model under consideration is a bosonic channel with
additive noise. A sequence of $n$ uses of the memory channel
maps $n$ input bosonic modes, with ladder operators
$\{a_k, a_k^\dagger\}_{k=1,\dots,n}$, onto $n$ output modes, described by the
operators $\{b_k, b_k^\dagger\}_{k=1,\dots,n}$.

In the Heisenberg picture, the mode operators are transformed
as follows:

$$b_k = a_k + z_k , \qquad b_k^\dagger = a_k^\dagger + z_k^* , \tag{12}$$

where $z_k \in \mathbb{C}$ is the value of the random variable $Z$ at the $k$th
step.

In the Schroedinger picture, a density operator $\rho_A^{(n)}$ describing
the state of the $n$ input modes is subjected to a random
displacement, i.e.

$$\rho_B^{(n)} = \int \left[ \prod_{k=1}^{n} dz_k \right] P(z_1, z_2, \dots z_n) \times \left[ \bigotimes_{k=1}^{n} \mathcal{D}_k(z_k) \right] \rho_A^{(n)} \left[ \bigotimes_{k=1}^{n} \mathcal{D}_k(z_k) \right]^\dagger , \tag{13}$$

where $\mathcal{D}_k(z_k)$ is the displacement operator acting on the $k$th
input mode, and $P(z_1, z_2, \dots z_n)$ is the joint probability
distribution of the $n$ noise variables.

The quantum channel is Gaussian if and only if the probability
distribution of the noise is Gaussian.

Our aim is to compute the capacity of the quantum channel.
In order to avoid unphysical results, we impose a constraint
on the maximum energy at the input modes by the following
condition:

$$\frac{1}{n} \mathrm{Tr}\left( \rho_A^{(n)} \sum_{k=1}^{n} a_k^\dagger a_k \right) \le N . \tag{14}$$

The memoryless limit is recovered iff the noise variables
are mutually independent and identically distributed, i.e. if
and only if the joint probability distribution is the product of
$n$ identical distributions:

$$P(z_1, z_2, \dots z_n) = \prod_{k=1}^{n} P(z_k) . \tag{15}$$

A remarkable case is obtained if the noise variables come
from a time-independent Markov process. In this case the
quantum channel satisfies the conditions of causality and invariance
under time translations and the structure theorem can
be applied. The joint probability distribution reads

$$P(z_1, z_2, \dots z_n) = \omega(z_n|z_{n-1}) \cdots \omega(z_2|z_1) P_1(z_1) , \tag{16}$$

where $\omega(z_k|z_{k-1})$ is the transition function determining the
Markov chain and $P_1(z_1)$ is the initial probability distribution
describing the noise variable at the first channel use.

In order to construct a Gaussian channel, one has to consider
a Gaussian stochastic process. Here we consider a Gaussian
transition function of the form:

$$\omega(z_k|z_{k-1}) \sim \exp\left[ - \frac{|z_k - \mu z_{k-1}|^2}{(1-\mu^2)\sigma} \right] . \tag{17}$$

Here and in the following we omit writing the normalization
factor in front of the probability density distributions.

The memory channel is hence described by two parameters.
The parameter $\mu \in [0,1]$ accounts for the memory effects, and
$\sigma > 0$, as will be made clear below, for the amount of noise
in the channel. The memoryless limit is recovered for $\mu = 0$,
in which case the joint probability distribution factorizes as in
(15).

The features of the memory channel depend on the underlying
Markov process. Inserting (17) into (10) we obtain

(18)

where $\sigma_n = \sigma(1 - \mu^{2n})$. By considering the limit $n \to \infty$ we
distinguish the following cases.


A. Noise process at the stationary state

For $\mu \in [0, 1[$ and $\sigma > 0$ there exists a unique stationary
distribution

$$P_s(z) \sim \exp\left( - \frac{|z|^2}{\sigma} \right) . \tag{19}$$

Hence, for these values of the parameters, the memory channel
is weak-forgetful. Notice that the parameter $\sigma$ is the noise
variance of the stationary distribution.

Considering the stationary state of the Markov process is
hence sufficient for computing the channel capacities. Upon $n$
channel uses the stationary process is described by a Gaussian
joint probability density distribution

$$P(z_1, \dots z_n) \sim \exp\left[ - \frac{\sum_{hk} z_h^* M_{hk} z_k}{(1-\mu^2)\sigma} \right] , \tag{20}$$

where $M$ is the $n \times n$ tridiagonal matrix:

$$M = \begin{pmatrix} 1 & -\mu & & & 0 \\ -\mu & 1+\mu^2 & -\mu & & \\ & -\mu & 1+\mu^2 & \ddots & \\ & & \ddots & 1+\mu^2 & -\mu \\ 0 & & & -\mu & 1 \end{pmatrix} . \tag{21}$$



For any $n$, the quadratic form appearing in (20) can always
be put in a diagonal form

$$\sum_{hk} z_h^* M_{hk} z_k = \sum_j m_j |\tilde z_j|^2 \tag{22}$$

in terms of the collective noise variables

$$\tilde z_j := \sum_k O_{jk} z_k , \qquad \tilde z_j^* := \sum_k O_{jk} z_k^* , \tag{23}$$

where $O$ is the $n \times n$ orthogonal matrix diagonalizing $M$:

$$\sum_{hk} O_{jh} M_{hk} O_{j'k} = \delta_{jj'} m_j . \tag{24}$$
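
The diagonalization in (22)-(24) can be checked numerically with a small sketch. It assumes NumPy is available; the helper name `memory_matrix` and the parameter values are illustrative choices, not part of the paper.

```python
import numpy as np

def memory_matrix(n, mu):
    """Tridiagonal matrix M of Eq. (21): diagonal (1, 1+mu^2, ..., 1+mu^2, 1),
    off-diagonal entries -mu. Requires n >= 2."""
    M = np.diag(np.full(n, 1.0 + mu**2))
    M[0, 0] = M[-1, -1] = 1.0
    off = np.full(n - 1, -mu)
    return M + np.diag(off, 1) + np.diag(off, -1)

n, mu = 6, 0.6                       # illustrative values
M = memory_matrix(n, mu)

# eigh returns eigenvalues in ascending order and orthonormal eigenvectors as columns.
m, V = np.linalg.eigh(M)
O = V.T                              # rows of O are eigenvectors, so O M O^T = diag(m)

# Collective noise variables, Eq. (23), for a sample complex noise vector z.
rng = np.random.default_rng(0)
z = rng.normal(size=n) + 1j * rng.normal(size=n)
z_tilde = O @ z

quadratic_form = float(np.real(np.conj(z) @ M @ z))        # left-hand side of Eq. (22)
diagonal_form = float(np.sum(m * np.abs(z_tilde) ** 2))    # right-hand side of Eq. (22)
print(quadratic_form, diagonal_form)
```

The two printed numbers agree up to rounding, and all eigenvalues are positive for $\mu < 1$.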



By applying unitary encoding and decoding transformations,
we can analogously define the collective input variables

$$\tilde a_j := \sum_k O_{jk} a_k , \tag{25}$$

and output variables

$$\tilde b_j := \sum_k O_{jk} b_k , \tag{26}$$

which transform according to

$$\tilde b_j = \tilde a_j + \tilde z_j . \tag{27}$$



Hence, $n$ uses of the memory channel are unitarily equivalent
to the tensor product of $n$ additive noise channels, whose
noise variables are mutually independent but not identically
distributed. From (22), the collective noise variables are
distributed according to the Gaussian distributions

$$\tilde P_j(\tilde z_j) \sim \exp\left( - \frac{|\tilde z_j|^2}{\tilde\sigma_j} \right) , \tag{28}$$

where the noise variances are

$$\tilde\sigma_j = \frac{(1-\mu^2)\sigma}{m_j} . \tag{29}$$

For any $n$, the distribution of the noise variances can be computed
from the eigenvalues of the matrix $M$. Notice that the
energy constraint is preserved in terms of the collective input
variables, i.e.

$$\frac{1}{n} \mathrm{Tr}\left( \rho_A^{(n)} \sum_{j=1}^{n} \tilde a_j^\dagger \tilde a_j \right) \le N . \tag{30}$$



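In the limit $n \to \infty$, the distribution of the eigenvalues
of the matrix $M$, arranged in nondecreasing order, tends to an
asymptotic distribution, described by the function

$$m^\infty(\lambda) = | 1 - \mu e^{i\lambda} |^2 \tag{31}$$

for $\lambda \in [0, \pi]$, in the sense that JH:

$$\lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^{n} | m_j - m^\infty(\pi j/n) | = 0 . \tag{32}$$

Analogously, for $m_j, m^\infty(\lambda) > 0$, we have

$$\lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^{n} | \tilde\sigma_j - \sigma(\pi j/n) | = 0 , \tag{33}$$

where the asymptotic distribution of the noise variances, arranged
in nonincreasing order, is

$$\sigma(\lambda) = \sigma \frac{1-\mu^2}{| 1 - \mu e^{i\lambda} |^2} . \tag{34}$$

As a consequence of (33), for any smooth function $F$, the
following equality holds true

$$\lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^{n} F(\tilde\sigma_j) = \int_0^\pi \frac{d\lambda}{\pi} F[\sigma(\lambda)] . \tag{35}$$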
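
The convergence statements (32) and (33) can be explored numerically. The sketch below assumes NumPy; the helper names and the parameter values are illustrative, and the noise variances are taken to be $(1-\mu^2)\sigma/m_j$, as follows from inserting (22) into (20).

```python
import numpy as np

def memory_matrix(n, mu):
    """Tridiagonal matrix M of Eq. (21)."""
    M = np.diag(np.full(n, 1.0 + mu**2))
    M[0, 0] = M[-1, -1] = 1.0
    off = np.full(n - 1, -mu)
    return M + np.diag(off, 1) + np.diag(off, -1)

def m_inf(lam, mu):
    """Asymptotic eigenvalue function of Eq. (31)."""
    return np.abs(1.0 - mu * np.exp(1j * lam)) ** 2

def sigma_inf(lam, mu, sigma):
    """Asymptotic noise-variance function of Eq. (34)."""
    return sigma * (1.0 - mu**2) / m_inf(lam, mu)

mu, sigma = 0.5, 1.0                                  # illustrative values
for n in (50, 200, 800):
    m = np.linalg.eigvalsh(memory_matrix(n, mu))      # eigenvalues, ascending order
    lam = np.pi * np.arange(1, n + 1) / n
    dev_m = np.mean(np.abs(m - m_inf(lam, mu)))       # average in Eq. (32)
    var = (1.0 - mu**2) * sigma / m                   # noise variances, nonincreasing
    dev_s = np.mean(np.abs(var - sigma_inf(lam, mu, sigma)))   # average in Eq. (33)
    print(n, dev_m, dev_s, var.mean())
```

Both averages decrease as `n` grows. The last column is a check of (35) with $F(x) = x$: the mean noise variance stays equal to the stationary variance `sigma`.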



Classical capacity 

The additive noise channel has been widely studied in the
memoryless, Gaussian case. We recall the case of the memoryless
broadband channel. At each use of the channel, $J$
input modes $\{a_\kappa, a_\kappa^\dagger\}_{\kappa=1,\dots,J}$ are subject to independent, but
not identically distributed, Gaussian additive noises with variances
$\sigma_\kappa$. A lower bound on the classical capacity can be obtained
by optimizing over Gaussian encodings. Moreover, using
the recently proven minimum output entropy conjecture, it is
possible to show that the classical capacity of the broadband
channel, per mode and expressed in bits, is

$$C = \frac{1}{J} \sum_{\kappa=1}^{J} g(N_\kappa + \sigma_\kappa) - g(\sigma_\kappa) , \tag{36}$$

where $g(x) := (x+1)\log_2(x+1) - x\log_2(x)$ and

$$N_\kappa = \left( \frac{1}{2^L - 1} - \sigma_\kappa \right)_+ , \tag{37}$$

where $(x)_+$ equals $x$ if $x > 0$ and is zero otherwise. The
value of the Lagrange multiplier $L$ is the root of the integral
equation

$$\frac{1}{J} \sum_{\kappa=1}^{J} \left( \frac{1}{2^L - 1} - \sigma_\kappa \right)_+ = N . \tag{38}$$

Using the result for the memoryless broadband channel we
can now compute the classical capacity of the memory channel,
in the region $\mu \in [0, 1[$, $\sigma > 0$, by following the same
line of reasoning of @].

For any $n$, we can group the set of collective modes in $J$
blocks of length $l = n/J$. At the boundaries of the $\kappa$th block
the maximum and minimum limits of the effective noise variances
are

$$\overline{\sigma}_\kappa := \limsup_{n\to\infty} \tilde\sigma_{(\kappa-1)n/J+1} , \qquad \underline{\sigma}_\kappa := \liminf_{n\to\infty} \tilde\sigma_{\kappa n/J} . \tag{39}$$

Recalling that the noise variances $\tilde\sigma_j$ are in nonincreasing order,
it follows that for arbitrary $\delta > 0$ and for sufficiently large
$n$

$$\underline{\sigma}_\kappa - \delta \le \tilde\sigma_{(\kappa-1)l+j} \le \overline{\sigma}_\kappa + \delta \tag{40}$$

for any $\kappa$ and $j = 1, \dots l$.

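From the last equation it follows that the classical capacity
of the memory channel is bounded from above and from
below by the capacity of two memoryless broadband channels,
respectively characterized by the set of $J$ noise variances
$\{\underline{\sigma}_\kappa - \delta\}_{\kappa=1,\dots,J}$ and $\{\overline{\sigma}_\kappa + \delta\}_{\kappa=1,\dots,J}$.

Now, keeping $J$ fixed and in the limit $l \to \infty$, we can write
the following bounds for the classical capacity:

$$\underline{C}_J \le C \le \overline{C}_J , \tag{41}$$

where

$$\underline{C}_J = \frac{1}{J} \sum_{\kappa=1}^{J} g(\underline{N}_\kappa + \overline{\sigma}_\kappa) - g(\overline{\sigma}_\kappa) , \tag{42}$$

$$\overline{C}_J = \frac{1}{J} \sum_{\kappa=1}^{J} g(\overline{N}_\kappa + \underline{\sigma}_\kappa) - g(\underline{\sigma}_\kappa) , \tag{43}$$

and the optimal distributions $\underline{N}_\kappa$, $\overline{N}_\kappa$ are as in Eqs. (37), (38).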

Finally, in the limit $J \to \infty$ the lower and upper bounds
coincide. Using (35), this yields the following formula for the
classical capacity:

$$C = \int_0^\pi \frac{dz}{\pi} \, g[N(z) + \sigma(z)] - g[\sigma(z)] , \tag{44}$$

where the function $N(z)$ is determined according to the continuous
limit of Eqs. (37), (38), i.e.

$$N(z) = \left( \frac{1}{2^L - 1} - \sigma(z) \right)_+ , \tag{45}$$

$$N = \int_0^\pi \frac{dz}{\pi} \left( \frac{1}{2^L - 1} - \sigma(z) \right)_+ . \tag{46}$$

The formulas (44), (45), (46) can be used to numerically
compute the classical capacity of the memory channel in the
region $\mu \in [0, 1[$ and $\sigma > 0$. The numerical results are plotted
in Fig. 2. We remark that, although we have assumed the noise
process to be at the stationary state, since the memory channel
is weak-forgetful, the obtained result is the classical capacity
for all the initial states of the memory.
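
One possible way to carry out such a numerical evaluation is sketched below. It is not the authors' procedure: the bisection on the level $1/(2^L - 1)$, the midpoint discretization of $[0, \pi]$, and all function names are assumptions made for illustration, with the variance profile taken from Eq. (34).

```python
import numpy as np

def g(x):
    """g(x) = (x+1) log2(x+1) - x log2(x), with g(0) = 0."""
    x = np.asarray(x, dtype=float)
    safe = np.where(x > 0, x, 1.0)                 # avoids log2(0); that term is 0 there
    return (x + 1.0) * np.log2(x + 1.0) - np.where(x > 0, x * np.log2(safe), 0.0)

def broadband_capacity(variances, N, iters=200):
    """Eqs. (36)-(38): bisection on the level w = 1/(2^L - 1) such that
    mean((w - sigma_k)_+) = N, then C = mean(g(N_k + sigma_k) - g(sigma_k))."""
    s = np.asarray(variances, dtype=float)
    lo, hi = s.min(), s.max() + N                  # mean energy is 0 at lo and >= N at hi
    for _ in range(iters):
        w = 0.5 * (lo + hi)
        if np.mean(np.clip(w - s, 0.0, None)) < N:
            lo = w
        else:
            hi = w
    Nk = np.clip(0.5 * (lo + hi) - s, 0.0, None)   # Eq. (37)
    return float(np.mean(g(Nk + s) - g(s)))

def memory_capacity(mu, sigma, N, grid=20000):
    """Eqs. (44)-(46) approximated on a midpoint grid over [0, pi]."""
    z = (np.arange(grid) + 0.5) * np.pi / grid
    s = sigma * (1.0 - mu**2) / (1.0 - 2.0 * mu * np.cos(z) + mu**2)
    return broadband_capacity(s, N)

N = 8.0
print(memory_capacity(0.0, 1.0, N), float(g(N + 1.0) - g(1.0)))   # memoryless limit
print(memory_capacity(0.9, 1.0, N), float(g(N)))                  # compared with noiseless value
```

For $\mu = 0$ the sketch reduces to the memoryless expression $g(N + \sigma) - g(\sigma)$, and for $\mu$ close to 1 the value moves towards the noiseless capacity $g(N)$.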




FIG. 2: (Color online.) The density plot shows the classical capacity
of the Markovian correlated additive noise channel as a function of the
parameters $\sigma$ and $\mu$. The maximum value of the number of excitations
per mode is $N = 8$, corresponding to a noiseless channel classical
capacity $g(N) \simeq 4.5293$.



B. Critical behavior 

Some care is needed in dealing with the parameter regions
defined by $\sigma = 0$ and $\mu \in [0, 1[$, and by $\mu = 1$. For
these values of the parameters the transition functions become
singular:

$$\lim_{\sigma \to 0, \, \mu < 1} \omega(z_k|z_{k-1}) = \delta(z_k - \mu z_{k-1}) , \tag{47}$$

$$\lim_{\mu \to 1} \omega(z_k|z_{k-1}) = \delta(z_k - z_{k-1}) . \tag{48}$$

In the limit $\sigma \to 0$, a unique stationary state exists, although
it is singular, i.e. $P_s(z) = \delta(z)$. We can still say that the
channel is weak-forgetful. By noticing that the stationary state
of the Markov process corresponds to a noiseless channel, we
can say that for $\sigma = 0$ the classical capacity of the memory
channel is given by the noiseless channel formula $C = g(N)$.

In the limit $\mu \to 1$, the Dirac $\delta$-function in (48) implies that
the noises acting at different channel uses are perfectly correlated.
It is immediate to recognize that in this case the Markov
process has infinitely many stationary states. The channel
hence has long-term memory and is not weak-forgetful. Thus we
cannot say a priori that the channel capacity is independent
of the memory initialization. However, we can still solve the
channel by proceeding as follows. Upon $n$ channel uses the
corresponding joint probability distribution of the noise variables
reads

$$P(z_1, \dots z_n) = \delta(z_n - z_{n-1}) \cdots \delta(z_2 - z_1) P_1(z_1) , \tag{49}$$

where $P_1(z_1)$ is the initial noise distribution. For a generic
initial noise distribution, even a non-Gaussian one, we can
solve the problem of the channel capacity by introducing suitable
encoding/decoding unitary transformations which allow us
to unravel the memory. For any $n$, we can define the collective
noise variable

$$\tilde z_1 := \frac{1}{\sqrt{n}} \sum_{k=1}^{n} z_k \tag{50}$$

together with a set of $n - 1$ variables

$$\tilde z_j := \frac{1}{\sqrt{n}} \sum_{k=1}^{n} \exp\left( i 2\pi \frac{j-1}{n} k \right) z_k , \quad \text{for } j = 2, \dots n . \tag{51}$$

In terms of these collective noise variables, the joint probability
distribution (49) factorizes as follows:

$$P(\tilde z_1, \dots \tilde z_n) = P_1(\tilde z_1) \delta(\tilde z_2) \cdots \delta(\tilde z_n) . \tag{52}$$

Hence, introducing the collective input and output variables

$$\tilde a_1 := \frac{1}{\sqrt{n}} \sum_{k=1}^{n} a_k , \qquad \tilde a_j := \frac{1}{\sqrt{n}} \sum_{k=1}^{n} \exp\left( i 2\pi \frac{j-1}{n} k \right) a_k ,$$

$$\tilde b_1 := \frac{1}{\sqrt{n}} \sum_{k=1}^{n} b_k , \qquad \tilde b_j := \frac{1}{\sqrt{n}} \sum_{k=1}^{n} \exp\left( i 2\pi \frac{j-1}{n} k \right) b_k ,$$

it follows that the collective mode $\{\tilde a_1, \tilde a_1^\dagger\}$ is subject to the additive
noise described by the initial noise probability $P_1(z_1)$,
while the remaining $n - 1$ collective modes experience a noiseless
channel. In conclusion, taking the limit $n \to \infty$ and independently
of the initial noise distribution, the classical capacity
of the memory channel is given by the noiseless formula
$C = g(N)$.
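
The unraveling of perfectly correlated noise can be checked with a short numerical sketch, assuming NumPy. The function name `collective_transform` and the sample values are illustrative; the matrix follows the Fourier-type coefficients of Eqs. (50)-(51).

```python
import numpy as np

def collective_transform(n):
    """Unitary matrix U with U[j-1, k-1] = exp(i 2 pi (j-1) k / n) / sqrt(n)
    for j, k = 1, ..., n, following Eqs. (50)-(51)."""
    rows = np.arange(n).reshape(-1, 1)          # j - 1 = 0, ..., n-1
    cols = np.arange(1, n + 1).reshape(1, -1)   # k = 1, ..., n
    return np.exp(2j * np.pi * rows * cols / n) / np.sqrt(n)

n = 8
U = collective_transform(n)

# Perfectly correlated noise (mu = 1): every channel use sees the same value z_1.
z1 = 0.7 - 1.2j                                 # arbitrary sample value
z = np.full(n, z1)
z_tilde = U @ z

print(np.round(z_tilde, 12))
```

Only the first collective variable is nonzero (it equals `sqrt(n) * z1` with this normalization), while the other `n - 1` variables vanish, in line with the delta functions in (52). Since `U` is unitary, the same transformation applied to the input amplitudes leaves the total energy unchanged.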

It is worth noticing that the classical capacity in the singular
region coincides with the analytical continuation of the
expression in Eq. (44).

To conclude this section we notice that the stationary Gaussian
process discussed in the previous subsection can be
mapped into the 'Gaussian model' discussed in JjJ. In this
mapping, the point $\mu = 1$, which gives rise to a channel
with long-term memory, corresponds to the critical point of
the Gaussian model.



IV. CONCLUSIONS 

In conclusion we have considered the notion of forgetfulness
for memory channels with Markovian correlated noise.
For the case of a Markov process with discrete noise variables
forgetfulness is equivalent to the existence of a unique
stationary noise distribution. In the case of a continuous variable
Markov process, we have introduced a notion of weak-forgetfulness.
A memory channel with continuous variable
Markovian correlated noise is weak-forgetful iff the noise process
has a unique stationary distribution. Moreover the capacities
are independent of the memory initialization. The notions
of forgetfulness and weak-forgetfulness are equivalent
in the discrete variable setting. As an application, we have
proposed a model of bosonic Gaussian channel with additive
Markovian correlated noise and computed the classical capacity.
The channel is either weak-forgetful or has long-term
memory. In all cases the classical capacity has been
computed exactly (Fig. 2 summarizes the obtained results).

It is worth noticing that the capacity is reached without the
use of entangled codewords. This can be easily proven by
noticing that the encoding transformation in Eq. (25) transforms
coherent states into coherent states, and recalling that
coherent state encoding is optimal to reach the memoryless
classical capacity in Eq. (36). This is related to the fact that
the considered channel model is covariant under gauge transformations
$a_k \to e^{i\phi} a_k$. Entangled codewords would be necessary
if one considers a noise process which breaks this symmetry,
see e.g. [10].

Other kinds of capacities can be computed along the same
line of reasoning for the considered model. Furthermore, by
exploiting the recently proven minimum output entropy conjecture
[11], the same methods can be applied to determine
the capacities of other bosonic channels, e.g. attenuation and
amplification channels, with Markovian noise.